Audio signal quality enhancement in a digital network

ABSTRACT

The invention relates to a network element (1) and a method for enhancing the quality of digitised analogue signals transmitted in parameterised coded form via a digital network. In order to enable an enhancement of the quality of the digitised analogue signals on the network side, the network element comprises means (20, 21) for extracting signals from and inserting signals into the network, first processing means (24) for processing the extracted parameters in the parameter domain with functions suitable to enhance the quality of the digitised analogue signals, and second processing means (26) for processing the extracted parameters in the linear domain with functions suitable to enhance the quality of the digitised analogue signals. In addition, analysing and selecting means (23, 27) determine the expected enhancement of quality in the different processing domains and cause a corresponding insertion of processed signals back into the network. The proposed method comprises corresponding steps.

FIELD OF THE INVENTION

The invention relates to a network element and a method for enhancing the quality of digitised analogue signals transmitted in parameterised coded form via a digital network.

BACKGROUND OF THE INVENTION

Digital networks like packet based IP (Internet Protocol) networks or TDM (Time Division Multiplex) based networks are employed to transmit not only signalling traffic but also digitised analogue signals, in particular audio signals like speech, and video.

Before an analogue signal can be transmitted by the digital network, an analogue-to-digital conversion of the signal has to be carried out. Further, the signal is usually compressed, e.g. with a ratio of 8:1 or 4:1, to allow a low bit rate access to the core network and for capacity savings within the core network itself.

When transferring voice between two IP terminals, for example, the speech is converted and compressed by an encoder in the source terminal to form parameterised coded digitised analogue signals and decompressed and reconverted by a decoder in the destination terminal, and vice versa.

The quality of the speech presented to an end user at the respective source terminal depends on a variety of factors.

A first group of factors is network related and comprises delay, lost packets etc. on the transmission route.

A second group of factors is terminal related and comprises the quality of the microphone, the loudspeakers, the A/D converter, the automatic level control, the echo canceller, the noise suppressor etc. A further terminal related factor is the surroundings of the terminal, like environmental noise. Besides the differing quality of the employed speech enhancement features or services, some terminals might even completely lack certain speech enhancement features or services which would be useful to increase the satisfaction of the end user.

A third group of factors appears when several networks are involved in one transmission, e.g. when an IP terminal inter-works with the PSTN (Public Switched Telephone Network) or a mobile access network. In such a case, additional degradations may result from echo from PSTN hybrids or from acoustic noise from mobile terminals etc. IP-PSTN gateways are utilised to enable the inter-working between the IP network and the PSTN or the mobile access network. These gateways may include features for enhancing the quality of the speech they transmit.

However, some gateways lack important speech enhancement features.

In digital networks, usually nothing is done on the network side to compensate for the terminal specific or network transition specific factors.

For GSM (Global System for Mobile Communications) networks, the ETSI (European Telecommunications Standards Institute) TFO (Tandem Free Operation) specifies how multiple encoding and decoding, especially at gateways and switches, can be avoided. When complying with the TFO model, a transmitted TFO stream includes parameterised coded speech that goes end-to-end in the speech parameter domain. The end-points may be two mobiles or a mobile and an IP terminal via a gateway. Two IP terminals interconnected only by an IP network involve a TFO by nature. The same principles are valid for GPRS (General Packet Radio Service) and third generation networks, where the speech may stay all the way in the packet based network. Exemplary routes of the latter are: MS-BS-RNC-SGSN-GGSN-IP terminal or MS-BS-PCU-SGSN-GGSN-IP terminal (MS: Mobile Station; BS: Base Station; RNC: Radio Network Controller; SGSN: Serving GPRS Support Node; GGSN: Gateway GPRS Support Node; PCU: Packet Control Unit). However, until end-to-end TFO connections are realised in all networks, the transition factors influencing the quality of transmitted digitised analogue signals still have to be considered, and the terminal specific factors are not affected by the TFO approach anyhow.

On the whole, it would be beneficial if digital networks provided means for enhancing the quality of digitised analogue signals. Multiple encoding and decoding, however, should be avoided for quality reasons.

For packet based networks, ITU-T specification H.323 (07/2000) introduces a multipoint processor (MP) used for conference calls. The multipoint processor prepares N-audio outputs from M-audio inputs by switching and/or mixing. For mixing, the input audio signals are decoded to linear signals on which a linear combination is performed. The resulting signal is encoded again to the appropriate audio format. It is proposed that the multipoint processor moreover eliminates or attenuates some of the input signals in order to reduce noise and other unwanted signals.

This means, however, that an additional decoding and encoding step is introduced as well, which should be avoided for the sake of the quality of the audio signal, as mentioned above, and of a small processing delay.

SUMMARY OF THE INVENTION

It is an object of the invention to provide a network element and a method that allow for a satisfactory enhancement of the quality of digitised analogue signals transmitted via a digital network on the network side.

On the one hand, this object is reached by a network element for enhancing the quality of digitised analogue signals transmitted at least in parameterised coded form via a digital network to which the network element has access, comprising: a payload extraction block for extracting coded digitised analogue signals from the digital network, which coded digitised analogue signals include at least in part parameterised coded digitised analogue signals; first processing means for processing the extracted parameterised coded digitised analogue signals in the parameter domain with functions suitable to enhance the quality of the digitised analogue signals; second processing means for processing at least part of the extracted coded digitised analogue signals in the linear domain with functions suitable to enhance the quality of the digitised analogue signals; a payload insertion block for inserting processed coded digitised analogue signals to the digital network; and analysing and selecting means for determining the quality improvement of the digitised analogue signals resulting from a processing of the extracted coded digitised analogue signals in the parameter domain and from a processing of the extracted coded digitised analogue signals in the linear domain and for causing that at least coded digitised analogue signals processed by the processing means leading to the better improvement are inserted back to the digital network by the payload insertion block.

On the other hand, the object is reached by a method for enhancing the quality of digitised analogue signals transmitted at least in parameterised coded form via a digital network, comprising:

-   extracting coded digitised analogue signals from the digital network, which coded digitised analogue signals include at least in part parameterised coded digitised analogue signals;
-   determining the quality improvement of the digitised analogue signals to be expected by a processing of the extracted encoded digitised analogue signals in the parameter domain and by a processing of the extracted encoded digitised analogue signals in the linear domain;
-   processing the extracted parameterised coded digitised analogue signals in the parameter domain at least if a greater quality improvement is expected by processing in the parameter domain, with functions suitable for enhancing the quality of digitised analogue signals;
-   processing at least part of the extracted coded digitised analogue signals in the linear domain at least if a greater quality improvement is expected by processing in the linear domain, with functions suitable for enhancing the quality of digitised analogue signals; and
-   inserting at least those processed coded digitised analogue signals to the digital network that were processed in the domain, the processing in which was expected to result in a greater quality improvement.

By including a possibility for processing transmitted coded digitised analogue signals not only in the linear domain but also in the parameter domain, the network element and the method according to the invention enable an optimal enhancement of the quality of digitised analogue signals on the network side.

The analysing and selecting means of the network element of the invention determine whether linear and/or parameter domain processing should be used by analysing whether linear or parameter domain processing results in a better quality improvement of the digitised analogue signals. A corresponding step is provided in the method of the invention. For example, if parameter domain processing is not technically feasible for the enhancement of the signal quality, linear processing is expected to result in a better quality enhancement. If the processing in the parameter domain is possible, the expected quality enhancement is determined for both kinds of processing and the selection is based on a comparison of the expected enhancements.

In case that a processing of extracted signals in the parameter domain is expected to lead to a better enhancement of the quality of the digitised analogue signal, at least signals processed in the parameter domain are inserted to the network again. In case that a processing of extracted signals in the linear domain is expected to lead to a better enhancement of the quality of the digitised analogue signal, only signals processed in the linear domain are inserted to the network again.

In the case that the processing in the parameter domain is expected to lead to better results, signals processed in the linear domain should only be inserted to the network in addition to signals processed in the parameter domain, if the processing in the linear domain leads to a larger processing delay because of necessary time consuming pre- and after-treatments. This way, it is possible to dispense with the disadvantageous additional decoding and encoding of the extracted signals necessary before processing parameterised coded digitised analogue signals in the linear domain. No additional decoding and encoding of the signals means a better quality of the digitised analogue signals and at the same time less processing delay. For example, parameterised coded digitised analogue signals transmitted via packet based networks, as well as coded digitised analogue signals transmitted in the TFO stream in a TDM based network, require decoding before and encoding after processing in the linear domain, while coded digitised analogue signals transmitted in the PCM stream in a TDM based network require only a-law or μ-law to linear conversions and vice versa for linear processing.

While the signals to be inserted to the network again are selected according to the expected quality improvement, a processing in both domains can be carried out in any case, if the processed signals are to be evaluated for determining which processing is expected to lead to a better result. In case that only signals processed in the parameter domain are to be inserted to the network again, this insertion can be carried out before the processing in the linear domain is completed. The signals processed in the linear domain are then used as soon as they are ready for determining the future expected quality improvements by linear processing.

Preferred embodiments of the invention become apparent from the subclaims.

The analysing and selecting means of the network element of the invention can base their decision whether a processing in the parameter domain or in the linear domain is to be carried out on an analysis of incoming parameter domain data, like parameters for gains. Alternatively or additionally, they can base the decision on measurements, like voice level, signal-to-noise ratio and presence of echo, carried out in the linear domain after decoding. Preferably, the measurements and the selection are made before and after the input data is processed in the linear and in the parameter domain. The selection of the processing domain can then be made by comparing the measurements to fixed thresholds that suggest either the linear or parameter domain processing. The numerical values for the thresholds can be derived by performing e.g. real listening tests with varying test input data that is processed and assessed in both domains.
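The threshold comparison described above may be sketched as follows. This is only an illustrative sketch: the names `Measurements` and `select_domain`, the threshold values and the direction in which each threshold points are assumptions made for the example, since the specification leaves the numerical values to listening tests.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    """Measurements taken in the linear domain after decoding (hypothetical container)."""
    voice_level_db: float  # measured speech level
    snr_db: float          # signal-to-noise ratio
    echo_present: bool     # whether echo was detected


# Assumed placeholder thresholds; real values would come from listening tests.
MIN_SNR_FOR_PARAM_DB = 15.0
MIN_LEVEL_FOR_PARAM_DB = -35.0


def select_domain(m: Measurements, param_feasible: bool = True) -> str:
    """Return "parameter" or "linear" by comparing measurements to fixed thresholds."""
    if not param_feasible:
        # Parameter domain processing is not technically feasible.
        return "linear"
    if m.echo_present:
        # Assumption for this sketch: detected echo suggests linear processing.
        return "linear"
    if m.snr_db < MIN_SNR_FOR_PARAM_DB or m.voice_level_db < MIN_LEVEL_FOR_PARAM_DB:
        # Assumption for this sketch: poor SNR or low level suggests linear processing.
        return "linear"
    return "parameter"
```

For instance, `select_domain(Measurements(-20.0, 30.0, False))` yields `"parameter"` under these assumed thresholds.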

As several factors affect the choice of the processing domain, it may be difficult to formulate threshold patterns that result in the best choices in all call conditions. Therefore, in a further preferred embodiment, a neural network based approach is used for selecting the processing domain that is expected to bring the better results. Incoming parameter domain data and results from measurements after decoding can be used as the input for the neural network of N neurons. Weights or coefficients for the neurons can be derived by training the network with appropriate test data and outputs from real listening tests.
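A neural network based selection of this kind could, for example, take the form of a small feed-forward network. The sketch below assumes a single hidden layer with sigmoid activations and a single output interpreted as the preference for parameter domain processing; the topology, the function names and the 0.5 decision point are assumptions of this example, and the weights would have to be obtained by training as described above.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, out_weights, out_bias):
    """Evaluate a one-hidden-layer network and return a score in (0, 1)."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)


def select_domain_nn(inputs, params) -> str:
    """Map the network score to a processing domain (assumed convention)."""
    score = forward(inputs, *params)
    return "parameter" if score >= 0.5 else "linear"
```

Here `inputs` would hold the incoming parameter domain data and the measurements after decoding, scaled to a suitable range, and `params` is the tuple `(hidden_weights, hidden_biases, out_weights, out_bias)` resulting from training.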

The processing means for processing in the parameter domain and the processing means for processing in the linear domain may include a variety of functions. Echo cancellation, noise reduction and level control are possible functions for processing in both the parameter and the linear domain. In addition, transcoding and speech mixing as a conference bridge are at least possible functions for processing in the parameter domain.

For example, for a gain control in the parameter domain, the gain parameters of the extracted parameterised coded digitised analogue signals can be compared with a desired gain for forming corresponding new gain parameters. The desired gain parameters can be pre-set, input by the user or calculated out of the received gain parameters. The new gain parameters are then inserted into the extracted parameterised coded digitised analogue signals, thus substituting the original gain parameters.
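A minimal sketch of such a parameter domain gain control is given below. It assumes, purely for illustration, that each frame is a dictionary carrying a gain parameter `gain_db` in decibels alongside its other coded parameters; real codecs quantise gains in codec specific ways, which is not modelled here.

```python
def control_gain(frames, desired_gain_db):
    """Shift the gain parameters so that their mean matches the desired gain.

    Only the gain parameter of each frame is replaced; all other
    parameters are passed through untouched, so no decoding is needed.
    """
    if not frames:
        return []
    mean_gain = sum(f["gain_db"] for f in frames) / len(frames)
    offset = desired_gain_db - mean_gain  # comparison with the desired gain
    # Build new frames in which the original gain parameter is substituted.
    return [{**f, "gain_db": f["gain_db"] + offset} for f in frames]
```

With two frames at -30 dB and -20 dB and a desired gain of -20 dB, the mean of -25 dB gives an offset of 5 dB, so the new gain parameters are -25 dB and -15 dB.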

In order to achieve a noise suppression by processing in the parameter domain, a processing in the time domain or in the frequency domain, preferably in both, is carried out. In the time domain, noise portions and low level signal portions of the extracted parameterised coded digitised analogue signals are attenuated and corresponding gain parameters are inserted in the extracted parameterised coded digitised analogue signals, thus replacing the original gain parameters. In the frequency domain, frequency portions of noise in the extracted parameterised coded digitised analogue signals which have approximately the same energy as the noise estimate are attenuated. Corresponding linear prediction parameters are then inserted to the extracted parameterised coded digitised analogue signals, thus replacing the original linear prediction parameters.
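The time domain part of this noise suppression may be illustrated by the following sketch, which covers only the attenuation of gain parameters and not the frequency domain modification of the linear prediction parameters. The frame layout, the `speech` flag (assumed to come from some voice activity decision), the noise floor estimate and the attenuation and margin values are all assumptions of the example.

```python
def attenuate_noise_frames(frames, noise_floor_db, attenuation_db=10.0, margin_db=3.0):
    """Lower the gain parameter of noise portions and low level portions.

    A frame counts as noise-like if it is not flagged as speech or if its
    gain lies within margin_db of the estimated noise floor.
    """
    out = []
    for frame in frames:
        noise_like = (not frame["speech"]) or frame["gain_db"] <= noise_floor_db + margin_db
        new_gain = frame["gain_db"] - attenuation_db if noise_like else frame["gain_db"]
        # Replace the original gain parameter, keep everything else.
        out.append({**frame, "gain_db": new_gain})
    return out
```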

For echo suppression in the parameter domain, parameterised coded digitised analogue signals are extracted from both transmission directions. The signals can then be compared in order to detect echoes in the first parameterised coded digitised analogue signals. Portions of the first parameterised coded digitised analogue signal are replaced by comfort noise portions, if an echo was determined in the portion of the first parameterised coded digitised analogue signal. The echo signal can also first be attenuated, and the residual echo signal is then suppressed. It is proposed to include a possibility for by-passing the first parameterised coded digitised analogue signals without echo compensation, if there is no signal activity in the opposite direction or if the signal level of the extracted parameterised coded digitised analogue signals is below a threshold level in the opposite direction.
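The by-pass and comfort noise logic described above can be sketched as follows. The echo decision used here, a plain level comparison between time aligned frames of the two directions, is a deliberately crude stand-in chosen for the example and is not taken from the specification; echo path delay is ignored, and the frame layout and all threshold values are likewise assumptions.

```python
def suppress_echo(near_frames, far_frames,
                  activity_threshold_db=-40.0,
                  echo_margin_db=6.0,
                  comfort_noise_db=-60.0):
    """Replace echo portions of the first signal by comfort noise frames."""
    out = []
    for near, far in zip(near_frames, far_frames):
        if far["gain_db"] <= activity_threshold_db:
            # No activity in the opposite direction: by-pass without echo compensation.
            out.append(near)
            continue
        # Assumed detector: a near frame clearly weaker than the far frame is treated as echo.
        if near["gain_db"] < far["gain_db"] - echo_margin_db:
            out.append({"gain_db": comfort_noise_db, "comfort_noise": True})
        else:
            out.append(near)
    return out
```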

In a preferred embodiment of the invention, a bad frame handler block is included in the network element. This block may work together with the payload extraction block and the processing means for detecting missing frames, e.g. from RTP (Real-time Transport Protocol) sequence numbers, for regenerating missing frames, e.g. by using interpolation techniques or copying previous frames, and for reordering frames in disorder within a buffering window. A suitable location for the bad frame handler block is immediately after the payload extraction block.
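The following sketch illustrates one possible bad frame handling for a single buffering window: frames are reordered by sequence number, and missing frames are regenerated by copying the previous frame. The function name and the representation of packets as `(sequence number, payload)` pairs are assumptions of the example; interpolation and the wrap-around of 16-bit sequence numbers are not handled.

```python
def handle_bad_frames(packets, first_seq):
    """Reorder one buffering window and fill gaps by repeating the previous frame.

    packets   -- list of (sequence_number, payload) in arrival order
    first_seq -- sequence number expected at the start of the window
    """
    by_seq = dict(packets)
    if not by_seq:
        return []
    out = []
    previous = None
    for seq in range(first_seq, max(by_seq) + 1):
        payload = by_seq.get(seq, previous)  # missing frame: copy the previous one
        if payload is None:
            continue  # nothing received yet that could be copied
        out.append((seq, payload))
        previous = payload
    return out
```

For example, the arrival order `[(3, "c"), (1, "a"), (4, "d")]` with `first_seq=1` is turned into `[(1, "a"), (2, "a"), (3, "c"), (4, "d")]`.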

In a further preferred embodiment of the invention, the network element comprises analysing means for determining whether any processing is to be applied to the extracted parameterised coded digitised analogue signals and for selecting the functions that are to be applied to extracted coded digitised analogue signals in the parameter domain and/or the linear domain. Those functions can be included in the analysing and selecting means used for determining the quality improvement expected by a processing in the parameter domain and by a processing in the linear domain.

In case no processing is deemed to be necessary, the coded digitised analogue signals can simply pass one or both of the processing means without any processing being carried out.

The choice can be taken by the analysing means autonomously by analysing the received coded digitised analogue signals and possibly by analysing already processed signals. Alternatively or additionally, the choice may depend on an external control signal. Even if an external control signal is employed and does not ask for any processing to be carried out, the analysing means can evaluate the quality of the received parameterised coded digitised analogue signals, e.g. with regard to speech level, existence of echo, signal-to-noise ratio, and select one or several processing functions. The external control signal can enter the network element via a control block in the network element, which may conform to the specified H.248 protocol, and indicates for example that there is already an echo canceller on the connection and that therefore the received parameterised coded digitised analogue signals can be forwarded without echo cancellation by the processing means. The control block can also have a direct access to the processing means for selecting by itself the processing functions that are to be carried out.

Selection of the most suitable functions to be employed is also a preferred feature of the method according to the invention.

The digital network involved may be either packet based, like IP-, UDP- (User Datagram Protocol) or RTP- (Real-time Transport Protocol) networks, or TDM based. Still, any other digital network transmitting parameterised coded digitised analogue signals can be accessed as well. When referring in this specification to an IP network, this includes any IP-, UDP- or RTP-network.

In a packet based network, the digitised analogue signals are only transmitted as parameterised coded digitised analogue signals. In a TDM based network, employed e.g. for GSM, the digitised analogue signals can be transmitted as parameterised coded digitised analogue signals in a TFO stream and simultaneously in a PCM (Pulse Code Modulation) stream as a-law or μ-law coded G.711 PCM samples.

Accordingly, in one preferred alternative, the payload extraction block is suitable to extract parameterised encoded digitised analogue signals from an IP stack of a packet-based network and the payload insertion block is suitable to insert parameterised encoded digitised analogue signals to said IP stack of the packet-based network.

In another preferred alternative, the payload extraction block is suitable to extract a TFO stream and, if desired, in addition a PCM stream from the timeslots of a TDM based network. In the latter case, the two streams are separated in the payload extraction block for further processing, and the payload insertion block is suitable to combine a supplied TFO stream with a supplied PCM stream again and to insert the combined stream to said TDM based network. If the payload insertion means is only provided with a PCM stream, however, it can also insert only this PCM stream back to said TDM based network again.

In GSM-PCM, the payload extraction block can take only the TFO stream as input or alternatively the TFO stream and the PCM stream, which are then separated in the payload extraction block.

An extracted TFO stream that is inserted to the digital network again has either been processed in the parameter domain or in the linear domain with a decoding before and an encoding after the linear processing. Which kind of TFO stream is inserted should depend on the achieved or achievable quality improvement of the included digitised analogue signal. In addition, the TFO stream processed after decoding in the linear domain should be transformed without prior encoding into a PCM stream that is combined with the selected encoded TFO streams for insertion into the digital network. However, in case no TFO stream is available at the payload extraction means or in case the TFO stream is stopped, the PCM stream can be extracted and processed in the linear domain and output to the digital network via the payload insertion means by itself.

Alternatively, the TFO stream can be processed in the parameter domain and the PCM stream, which does not have to be decoded for linear processing, can be processed in parallel in the linear domain. In case the TFO stream is only processed if it is expected to lead to a better result than the processing of the PCM stream, the TFO stream is not necessarily included in the data inserted to the network again when not processed.

The network element according to the invention can be located freely beside or inside any other network element. In a packet based network, the network element of the invention is preferably co-located with a broadband IP node, which leads to minimal processing delays.

The network element and the method of the invention can be used for the enhancement of the quality of any digitised analogue signals transmitted by a digital network in parameterised coded form. They are of particular relevance for transmitted speech, but also e.g. for video.

BRIEF DESCRIPTION OF THE FIGURES

In the following, the invention is explained in more detail with reference to drawings, of which

FIG. 1 shows the integration of the network element according to the invention in an IP-network;

FIG. 2 shows a first embodiment of the network element according to the invention;

FIG. 3 shows a second embodiment of the network element according to the invention;

FIG. 4 shows a third embodiment of the network element according to the invention;

FIG. 5 shows a block diagram of an embodiment of a parameter domain gain control;

FIG. 6 shows a block diagram of an embodiment of a parameter domain noise suppression;

FIG. 7 shows a block diagram of an embodiment of a parameter domain echo suppression; and

FIG. 8 shows a block diagram of an embodiment of a parameter domain echo cancellation.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 shows the environment of a network element 1 according to the invention.

A first terminal 2 is connected via an IP network with a second terminal 3. Both terminals 2, 3 can be IP phones. At some place in the IP network, there is an IP router forming a broadband IP node 4. Co-located with and connected to this network node 4, there is a network element 1 according to the invention.

Network element 1 operates in the speech parameter domain and is able to perform signal processing functions for parameterised coded speech. The available functions are echo cancellation, noise reduction, gain control, conference bridge and bad frame handling. Possibilities for realising some of those functions will be described later with reference to FIGS. 5 to 8.

Parameterised coded speech passes from the first terminal 2 to the network node 4. It is forwarded from the network node 4 to the network element 1, which carries out the appropriate functions in the speech parameter domain. Then, the processed parameterised coded speech is sent back to the network node 4, which forwards it to its destination, the second terminal 3.

FIG. 2 shows the different elements comprised in an embodiment of the network element 1 of FIG. 1.

A payload extraction block 20 and a payload insertion block 21 together form the interface of the network element 1 to the network node 4. Within the network element 1, the payload extraction block 20 is connected via a bad frame handler block 22 to an analyser and selector block 23. The two outputs of the analyser and selector block 23 are connected on the one hand to first processing means 24 and on the other hand via a speech decoding block 25 to second processing means 26. Each of the processing means 24, 26 comprises a function for echo cancellation, for noise reduction and for level control. The output of the first processing means 24 is connected to the input of a selector 27. The output of the second processing means 26 is equally connected to the input of the selector 27, but via a speech encoding block 28. The output of the selector 27 is input to the payload insertion block 21. Finally, there is a control block 29, e.g. an H.248 protocol control block, which receives as input a control signal generated externally of the network element 1 and the output of which is connected to the analyser and selector block 23.

The network element 1 functions as follows:

The payload extraction block 20 extracts the payload, i.e. parameterised coded speech, from the IP stack of the network node 4 of FIG. 1. The speech parameters are checked by the bad frame handler block 22. Here, missing frames are detected and regenerated by using interpolation techniques. Moreover, frames in disorder are reordered within a buffering window. The processed signals are then forwarded to the analyser and selector block 23.

The analyser and selector block 23 analyses the speech parameters and determines whether a processing in the linear domain or in the parameter domain would lead to a better result and which of the available functions should be applied. If parameter domain processing is not technically feasible for the speech enhancement, linear processing is selected. The analyser and selector block 23 can also determine that no processing at all needs to be carried out. The analyser and selector block 23 receives in addition external information via the control block 29, indicating for example whether there is already an echo canceller on the connection so that a further echo cancellation is not necessary.

If no processing or a processing in the parameter domain was selected, the analyser and selector block 23 outputs the encoded speech to the first processing means 24, which applies all selected functions to the parameterised coded speech in the parameter domain.

If a processing in the linear domain was determined to be necessary, the analyser and selector block 23 outputs the parameterised coded speech to the speech decoding block 25. The speech decoding block 25 decodes the coded speech, which may be suitable for GSM FR (Full Rate), to form a linear signal. The linear speech signal is then input to the second processing means 26, which applies all selected functions to the linear speech signal in the linear domain. After processing, the linear speech signal is input to the speech encoding block 28, which encodes the linear speech signal to form parameterised coded speech suitable for GSM FR again.

The selector 27 receives the output signals of the speech encoding block 28 and of the first processing means 24 and is moreover controlled by the analyser and selector block 23. Therefore, the selector 27 is able to determine whether the signals from the first processing means 24 or the signals from the speech encoding block 28 constitute processed coded speech and to forward the respective signals to the payload insertion block 21. The selector 27 can moreover support the work of the analyser and selector block 23 by providing information about processed signals.

In the payload insertion block 21, the parameterised coded speech is inserted back as payload to the IP stack of the network node 4, from where it is forwarded to its destination 3.

On the whole, an enhancement of the quality of speech can be achieved, while additional decoding and encoding is only carried out if necessary. A superfluous decrease in the speech quality is therefore avoided and the processing delay is kept low by the processing in the parameter domain. Since the network element 1 is co-located with the broadband IP node 4, processing delays are further minimised.

FIG. 3 schematically illustrates another embodiment of the network element of the invention. The embodiment is similar to the first embodiment of the network element, but it is employed for processing of encoded speech parameters received from a network node in a TDM based network, which is used for GSM TFO.

Like the network element of FIG. 2, the network element of FIG. 3 comprises a payload extraction block 30, a bad frame handler 32, an analyser and selector block 33, a decoding block 35, first and second processing means 34, 36, an encoding block 38, a payload insertion block 31 and an H.248 control block 39. Both processing means 34, 36 comprise again functions for echo cancellation, noise reduction and level control. The elements are connected to each other in the same way as in FIG. 2. In contrast to the network element of FIG. 2, however, instead of a selector block 27, a second analyser and selector block 37 is integrated between the encoding block 38 and the payload insertion block 31. Moreover, the output of the second processing means 36 is not only connected to the encoding block 38, but also directly to the payload insertion block 31.

The network element of the second embodiment functions as follows:

The signal entering the payload extraction block 30 from a network node contains a G.711 PCM stream of 48 or 56 kbps in the most significant bits and GSM TFO encoded speech parameters at 16 or 8 kbps in the least significant bits. In the payload extraction block 30, the TFO stream is separated from the PCM stream. Only the TFO stream is forwarded to the bad frame handler block 32, where it is treated as described for the treatment of the parameterised coded speech in the embodiment of FIG. 2.
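The bit level separation can be illustrated under the assumption of a 64 kbps timeslot carrying 8000 octets of 8 bits per second: two least significant bits per octet then give 16 kbps for the TFO stream and leave 48 kbps for PCM, while one least significant bit gives 8 kbps and leaves 56 kbps. The sketch below shows only this splitting and recombining of bits; the function names are hypothetical, and the framing and synchronisation of actual TFO frames are not modelled.

```python
def split_timeslot(octets: bytes, tfo_bits: int = 2):
    """Separate each octet into its PCM part (upper bits) and TFO part (lower bits)."""
    if tfo_bits not in (1, 2):
        raise ValueError("tfo_bits must be 1 (8 kbps) or 2 (16 kbps)")
    mask = (1 << tfo_bits) - 1
    pcm = bytes(o & ~mask & 0xFF for o in octets)  # most significant bits, TFO bits cleared
    tfo = [o & mask for o in octets]               # least significant bits
    return pcm, tfo


def combine_timeslot(pcm: bytes, tfo, tfo_bits: int = 2) -> bytes:
    """Recombine a PCM stream and a TFO stream into timeslot octets."""
    mask = (1 << tfo_bits) - 1
    return bytes((p & ~mask & 0xFF) | (t & mask) for p, t in zip(pcm, tfo))
```

Splitting and recombining with the same `tfo_bits` reproduces the original octets, which corresponds to the separation in the payload extraction block and the later combination in the payload insertion block.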

After the bad frame handling, the TFO stream is inputted to the analyserand selector block 33. The analyser and selector block 33 forwards theTFO stream on the one hand to the first processing means 34, where thestream is processed in the parameter domain. On the other hand, theanalyser and selector block 33 forwards the TFO stream to the decodingmeans 35, where a speech decoding, e.g. again a GMS FR to lineardecoding, is carried out. The decoded TFO stream is then inputted to thesecond processing means 36, where it is processed in the linear domain.For both processing means 34, 36, the functions to be applied are chosenin the first analyser and selector means 33 according to an externalcontrol signal entering the network element via the control block 39.

The output of the first processing means 34 fed to the analyser andselector block 37. The output of the second processing means 36 isspeech encoded again in the encoding means, e.g. linear to GSM FRencoding, and fed to the second analyser and selector block 37 as well.

The first analyser and selector block 33 and the second analyser and selector block 37 work together to determine which processing, the one in the parameter domain or the one in the linear domain, results in a better voice quality.

If parameter processing of the TFO stream is determined to result in a better voice quality than linear processing of the decoded TFO stream, only the TFO stream coming from the first processing means 34 is forwarded by the second analyser and selector block 37 to the payload insertion means 31. If linear processing of the decoded TFO stream is determined to result in a better voice quality than parameter processing of the TFO stream, only the TFO stream coming from the encoding block 38 is forwarded by the second analyser and selector block 37 to the payload insertion means 31.

Both paths can be working all the time so that a change between the different modes, pure linear processing and parallel processing, can be carried out without discontinuities in the internal states of the decoding means 35 and the encoding means 38.

In addition, the output of the second processing means 36 is forwarded without any encoding directly to the payload insertion means 31. In the payload insertion means 31, a PCM stream is formed out of the decoded and linearly processed TFO stream. The PCM stream and the selected coded TFO stream are then combined and inserted back into the TDM based network for further transmission.

Thus, the speech quality of the digitised analogue signal in the output PCM stream is improved by linear processing and the speech quality of the digitised analogue signal in the output TFO stream is improved by processing in the parameter domain or in the linear domain, depending on which processing leads to a better result.

If there is no TFO stream available in the signal extracted by the payload extraction means 30, or if the TFO stream is stopped, a possibility is provided for conducting the PCM stream through the bad frame handler 32 for frame related treatment and through the second processing means 36 for processing in the linear domain. Passing through a decoding block is not necessary, since the PCM stream does not contain parameterised data. It should be noted, though, that linear processing of a G.711 PCM stream requires A-law or μ-law to linear conversions and vice versa. The processed PCM stream is then inserted into the digital network again by the payload insertion means 31.

FIG. 4 schematically illustrates a third embodiment of the network element of the invention constituting a second option for enhancing the quality of speech in a TDM based network used for GSM TFO.

In this example, a payload extraction block 40 is connected via a bad frame handler block 42 directly to first and second processing means 44, 46. Both processing means 44, 46 again comprise functions for echo cancellation, noise reduction and level control. The outputs of the first and the second processing means 44, 46 are likewise connected directly to inputs of the payload insertion block 41. An H.248 protocol control block 49 is present again.

The network element of the third embodiment functions as follows:

The PCM stream and the TFO stream entering the payload extraction block 40 from a network node are separated by the payload extraction block 40 as in the embodiment of FIG. 3. In this embodiment, however, both the TFO stream and the PCM stream are forwarded to the bad frame handler block 42 and treated there as explained with reference to FIG. 2.

After the bad frame handling, the TFO stream is forwarded to the first processing means 44, where it is processed in the parameter domain. At the same time, the PCM samples are forwarded to the second processing means 46. Since in this embodiment only the PCM samples are processed by the processing means 46 working in the linear domain, a decoding block is not necessary; as mentioned with regard to the embodiment of FIG. 3, the PCM stream does not contain parameterised data. In both processing means 44, 46, the functions to be applied are chosen according to an external control signal by means of the control block 49 of the network element.

Thus, speech enhancement is carried out for both the TFO stream and the PCM stream, separately and at the same time. In any case, the coded speech in the TFO stream is not decoded for processing and encoded again.

The TFO stream and the PCM stream leaving the processing means 44, 46 are combined in the payload insertion block 41 and inserted back into the TDM based network for further transmission. It can be decided at some other place of the network which one of the streams should be used for obtaining the best voice quality.

Each of the three described embodiments of the network element according to the invention allows for an enhancement of the quality of parameterised speech or video on the network side with minimal processing delay. They can be located freely beside or inside any existing network element.

Now, different possibilities of processing in the parameter domain in the first processing means 24, 34, 44 of one of FIGS. 2 to 4 will be described with reference to FIGS. 5 to 8.

FIG. 5 shows a block diagram of a gain control device that can be integrated in a first processing means of a network element according to the invention for gain control in the parameter domain. An input line is connected on the one hand to the input of a decoder 50 and on the other hand to a first input of a gain parameter re-quantisation block 53. The decoder 50 is further connected directly and via a speech level estimation block 51 to a linear-to-parameter domain mapping block 52. The output of the linear-to-parameter domain mapping block 52 is connected to a second input of the gain parameter re-quantisation block 53, which is in addition connected to an output line.

Incoming coded speech frames are forwarded to the decoder 50, where the coded speech is linearised before being fed to the speech level estimation block 51. The speech level estimation block 51 comprises an internal voice activity detector (VAD) used for indicating whether the level estimate has to be updated, since the speech level estimate should reflect only the level of speech.

In the speech level estimation block 51, a desired gain value is calculated based on an estimated speech level and a predetermined desired target speech level. The desired gain is fed to the first input of the linear-to-parameter domain mapping block 52.
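
As a minimal sketch of such a VAD-gated level estimate and gain calculation, the following Python code may be considered. It assumes linear samples normalised to the range -1 to 1, a first-order recursive average and levels in dB; the class name, the smoothing factor `alpha` and the default target level are illustrative assumptions, not values taken from this description.

```python
import math


class SpeechLevelEstimator:
    """Long term speech level estimate, updated only for speech frames."""

    def __init__(self, target_db=-26.0, alpha=0.9, initial_db=-26.0):
        self.target_db = target_db    # assumed desired target speech level
        self.alpha = alpha            # assumed smoothing factor
        self.level_db = initial_db    # current long term estimate

    def update(self, frame, vad_speech):
        # The estimate is updated only when the VAD indicates speech.
        if vad_speech and frame:
            power = sum(s * s for s in frame) / len(frame)
            frame_db = 10.0 * math.log10(max(power, 1e-12))
            self.level_db = (self.alpha * self.level_db
                             + (1.0 - self.alpha) * frame_db)
        return self.level_db

    def desired_gain(self):
        # Linear gain that would move the estimated level to the target.
        return 10.0 ** ((self.target_db - self.level_db) / 20.0)
```

The linear value returned by `desired_gain` corresponds to the desired gain that is passed on to the mapping block 52.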

The speech level estimation block 51 is only needed for an automatic level control. If a fixed gain control is to be used, possibly with a user settable gain, the decoder 50 and the speech level estimation block 51 can be omitted.

Also fed to the linear-to-parameter domain mapping block 52 are decoded gain parameters of current speech frames of e.g. 20 ms or of sub-frames of e.g. 5 ms, which come directly from the decoder 50. The decoded gain parameters are typically excitation gain parameters of a code excited linear prediction (CELP) speech coder. These gain parameters typically consist of adaptive and fixed codebook gains, which are vector quantised for the transmission. Scalar values of these parameters can be obtained from internal intermediate values of the decoder 50.

In the linear-to-parameter domain mapping block 52, the linear desired gain value is converted to appropriate new gain parameters of a speech coder. A codebook based mapping is used for determining these new gain parameters for the current frame or sub-frame in order to achieve the desired gain. The codebook is a three-dimensional table in which adaptive codebook gain, fixed codebook gain and linear gain values each form one dimension. The new gain parameter values are read from the table as soon as all input values for the frame or sub-frame are known. This table is trained beforehand in such a way that the errors between the new gain parameter values and the gain parameter values of gain scaled coded frames are minimised for each desired linear gain value. Alternatively, the mapping table could be trained by minimising the error between the decoded re-quantised speech frame and a decoded gain scaled speech frame. The training requires several test sequences in order to fully train all elements within the mapping table.
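
The table lookup itself can be sketched in Python as follows. The grids, the function names and in particular the table contents are hypothetical: the placeholder table simply keeps the adaptive codebook gain and scales the fixed codebook gain, which stands in for the trained entries described above and is not claimed to be what training would produce.

```python
import bisect

# Hypothetical quantisation grids for the three table dimensions.
ADAPTIVE_GRID = [0.0, 0.4, 0.8, 1.2]
FIXED_GRID = [0.0, 0.5, 1.0, 2.0, 4.0]
LINEAR_GAIN_GRID = [0.5, 1.0, 2.0]


def nearest_index(grid, value):
    """Index of the grid point closest to value (grid must be sorted)."""
    i = bisect.bisect_left(grid, value)
    if i == 0:
        return 0
    if i == len(grid):
        return len(grid) - 1
    return i if grid[i] - value < value - grid[i - 1] else i - 1


def build_placeholder_table():
    """Untrained stand-in table: entry = (new adaptive gain, new fixed gain)."""
    return [[[(ga, gf * g) for g in LINEAR_GAIN_GRID]
             for gf in FIXED_GRID]
            for ga in ADAPTIVE_GRID]


def map_gain(table, adaptive_gain, fixed_gain, linear_gain):
    """Read the new gain parameters for one frame or sub-frame."""
    i = nearest_index(ADAPTIVE_GRID, adaptive_gain)
    j = nearest_index(FIXED_GRID, fixed_gain)
    k = nearest_index(LINEAR_GAIN_GRID, linear_gain)
    return table[i][j][k]
```

A coarser grid in any dimension reduces the table size at the cost of mapping accuracy.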

In practical implementations it might be useful to compress the size of the table either by utilising redundancy in the data, by limiting linear gain values or by increasing the step size of input values. Another choice is to find a mathematical function which approximates the mapping function in such a way that the performance is subjectively acceptable.

Finally, the new gain values are re-quantised for the transmission and the original gain values are replaced with the new values in the gain parameter re-quantisation block 53.

FIG. 6 shows a block diagram of a noise suppression device that may be integrated in a first processing means of a network element according to the invention for noise suppression in the parameter domain.

An input line is again connected on the one hand to the input of a decoder 60 and on the other hand to a first input of a gain parameter re-quantisation block 63. A first output of the decoder 60 is connected via a speech level estimation block 61, a VAD 66, a noise level and spectrum estimation block 64 and a short term signal level and spectrum calculation block 65 to a block 67 for determining noise attenuation parameters. The output of the VAD 66 is moreover connected to an input of the speech level estimation block 61 as well as to an input of the noise level and spectrum estimation block 64.

A first output of the block 67 for determining noise attenuation parameters is connected to a first input of a spectrum-to-LP (linear prediction) mapping block 68 and a second output to a first input of a linear-to-parameter domain mapping block 62.

A second output of the decoder 60 is connected to a further input of the noise level and spectrum estimation block 64 and of the short term signal level and spectrum calculation block 65 and additionally to a second input of the spectrum-to-LP mapping block 68. A third output of the decoder 60 is connected to a second input of the linear-to-parameter domain mapping block 62.

The output of the linear-to-parameter domain mapping block 62 is connected to a second input of the gain parameter re-quantisation block 63, the output of which is in turn connected to a first input of an LP parameter re-quantisation block 69. The second input of this block 69 is connected to the output of the spectrum-to-LP mapping block 68.

Finally, the output of the LP parameter re-quantisation block 69 is connected to an output line.

The decoder 60, the speech level estimation block 61, the linear-to-parameter domain gain mapping block 62 and the gain parameter re-quantisation block 63 can be identical or quite similar to the corresponding blocks 50-53 of the example of FIG. 5.

In the example of FIG. 6, noise suppression can be achieved by time-domain or frequency-domain parameter processing. The best performance can be obtained by combining both methods.

The time-domain processing is based on a dynamic processing in which noise portions and very low level speech portions are slightly attenuated by a gain control function making use of the blocks 60-63 corresponding to the blocks 50-53 of FIG. 5. The gain control is therefore carried out as explained above, except that block 67 is used for forwarding the speech level estimate received from block 61 to the linear-to-parameter domain mapping block 62. This can be understood as an expanding function in the parameter domain.

In the frequency-domain noise suppression, the frequency portions in which noise has more energy than speech are attenuated. Traditionally, a linear time-domain signal is first converted to the frequency domain by utilising a Fourier transform or filter banks. Then, a spectral subtraction can be applied to the frequency-domain signal. The amount of subtraction is based on a noise estimate, the signal-to-noise ratio and possibly other parameters. Finally, the noise attenuated signal is converted back to the time domain. In this example, however, the frequency-domain processing is carried out by re-shaping a Linear Prediction (LP) spectrum envelope of speech frames. This is explained in more detail in the following.

To achieve a high quality noise suppression, an accurate noise estimate has to be modelled. In order to differentiate between speech and speech pauses, a voice activity detector 66 is employed, which outputs a speech flag “true” when speech is detected and a speech flag “false” when a speech pause is detected. The voice activity detector 66 needs to be of high quality in order to get accurate VAD decisions even in low signal-to-noise ratio conditions, otherwise speech and noise estimates will diverge. Basically, the speech level estimate is updated in the speech level estimation block 61 when the speech flag is true, and noise level and spectrum estimates are updated in the noise level and spectrum estimation block 64 when the speech flag is false.

In block 64, the long term noise level and spectrum are estimated. For the long term noise spectrum estimate, Linear Prediction Coefficients (LPC) need to be decoded in the decoder 60 from the received speech frame. The LP coefficients are often converted to Line Spectral Pairs (LSP) by the encoder employed for encoding. In that case, the LPC values can be obtained from internal intermediate values of the decoder 60. As the LP coefficients define only the spectral envelope, the noise level estimate is required to scale the LP spectral envelope in order to form a power spectrum estimate of the noise. Alternatively, the LP spectral envelope could be scaled by using excitation gain parameters of the received frame. As already mentioned above, the noise estimate is updated only if the VAD flag is false.

In the short term signal level and spectrum calculation block 65, a short term signal level and spectrum are calculated for the received frame in the same manner as described above, except that no averaging or only a fast averaging of previous frames is used for the level calculation. Typically, VAD decisions are not utilised.

The main intelligence of the algorithm lies in the block 67 for determining noise attenuation parameters. In this block 67, frequency-domain noise attenuation parameters (i.e. the desired spectrum shaping) are selected according to the long term noise spectrum estimate received from block 64 and the short term signal spectrum received from block 65. Correspondingly, the desired time-domain gain is based on the long term speech and noise levels and on the short term signal level. Moreover, VAD information received from the VAD 66 and a long term signal-to-noise ratio calculated from the speech and noise level estimates received from blocks 61 and 64 are utilised as extra information for the algorithm of the block 67 for determining noise attenuation parameters.

In the spectrum shaping in block 67, the long term noise spectrum estimate is compared with the short term signal spectrum. A target frame spectrum is shaped in such a way that those short term spectrum parts which are quite close to the long term noise spectrum are slightly attenuated. On the other hand, those parts which are clearly above the long term noise spectrum are left untouched because they likely contain speech information. Additionally, the frequency and temporal masking of the human auditory system can be utilised in frequency shaping. This means that if some parts of the spectrum lie within an auditory frequency masking curve, no frequency shaping is required for those parts. In temporal masking, no frequency shaping (or time-domain processing) is needed for the current frame if one or more previous frames contained a higher speech level, which introduces a temporal masking effect for lower level signals of the current frame. Using these rules results in less distortion of the processed speech, as less shaping is done.
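
The basic comparison rule, without the masking refinements, might be sketched in Python as below. The per-band dB representation, the function name and the parameters `margin_db` and `max_atten_db` are assumptions made for illustration only; the description does not specify how "quite close" and "slightly attenuated" are quantified.

```python
def shaping_gains_db(short_term_db, noise_db, margin_db=6.0, max_atten_db=3.0):
    """Per-band shaping gain in dB (zero or negative).

    Bands whose short term level is at or below the long term noise
    estimate get the full (slight) attenuation, bands at least margin_db
    above the noise estimate are left untouched, and the attenuation is
    interpolated linearly in between.
    """
    gains = []
    for signal, noise in zip(short_term_db, noise_db):
        excess = signal - noise
        if excess >= margin_db:
            gains.append(0.0)  # likely speech: leave untouched
        else:
            closeness = 1.0 - max(excess, 0.0) / margin_db
            gains.append(-max_atten_db * closeness)
    return gains
```

The resulting gains describe the desired spectrum relative to the short term spectrum and would still have to be mapped to LP coefficients.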

Furthermore, the spectrum shaping can be controlled by the VAD flag in such a way that less shaping is applied if a speech pause was detected. The noise attenuation is then mainly achieved by gain processing during speech pauses by blocks 60-63. In addition, the short term signal level can also control the amount of shaping. Namely, there is less shaping with low level frames as the noise attenuation is partly handled with gain processing. Finally, the amount of spectrum shaping can depend on the long term signal-to-noise ratio (SNR) in such a way that less shaping is applied at high SNR in order to preserve high quality in noiseless speech conditions.

As soon as the desired spectrum shaping is calculated for the current frame, the original LP coefficients have to be converted according to the desired spectrum. This is carried out in the spectrum-to-LP mapping block 68. The mapping can be realised again as codebook mapping by using the original LPC and the desired spectrum as input parameters. Alternatively, new LP coefficients could directly be calculated from the desired spectrum by converting the spectrum to an LP spectrum envelope and thereby converting it to LP coefficients.

Finally, in the LP parameter re-quantisation block 69, the new LPC parameters are quantised or converted to LSP parameters and the old parameters are replaced with new ones in the coded frames.

As mentioned previously, a signal dynamics expanding function can be used together with the spectrum shaping or it can even be used alone. If it is used alone, only a slight expansion is allowed as it might cause a noise modulation effect. Basically, in expansion, the lower the signal level is, the more attenuation is applied. The expansion threshold is controlled by the noise level estimate in such a way that a frame or sub-frame exceeding the noise level estimate is not attenuated. Furthermore, the VAD 66 can control the expansion in such a way that slightly less expansion is utilised whenever the current frame is a speech frame. Thereby the attenuation of low level speech phonemes can be minimised.
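
A minimal Python sketch of such an expanding function is given below, assuming frame and noise levels in dB. The expansion `ratio`, the attenuation cap `max_atten_db` and the halving of the expansion for speech frames are illustrative assumptions; the description only states the qualitative behaviour.

```python
def expansion_gain_db(frame_level_db, noise_level_db, ratio=1.5,
                      max_atten_db=6.0, is_speech=False):
    """Gain in dB (zero or negative) applied to one frame or sub-frame."""
    # Frames at or above the noise level estimate are not attenuated.
    if frame_level_db >= noise_level_db:
        return 0.0
    # Slightly less expansion for speech frames (assumed: half the slope).
    if is_speech:
        ratio = 1.0 + (ratio - 1.0) * 0.5
    # The lower the level below the threshold, the more attenuation,
    # limited to keep the expansion slight.
    below = noise_level_db - frame_level_db
    return -min((ratio - 1.0) * below, max_atten_db)
```

The returned dB value can be converted to a linear gain and passed to the linear-to-parameter domain mapping in the same way as a level control gain.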

As soon as the desired linear gain for the current frame or sub-frame is found, the linear-to-parameter domain mapping and gain parameter re-quantisation can be carried out in blocks 62 and 63 as described with reference to the gain control. As a result, modified gain and LPC parameters are transmitted with other speech parameters over the transmission media.

FIG. 7 shows a block diagram of an echo suppression device that can be integrated in a first processing device of a network element according to the invention for echo suppression in the parameter domain.

A first input line is connected to a first decoder 70 and a second input line is connected to a second decoder 71, both decoders 70, 71 being connected in turn to an echo analysis block 72. The output of the first decoder 70 is further connected via a noise estimation block 73, a comfort noise generation block 74 and an encoder 75 to one connection of a switch 76. The switch 76 can either form a connection between the encoder 75 and an output line or between the first input line and the output line. The echo analysis block 72 has a controlling access to this switch 76.

In order to be able to determine if a signal transmitted from a near end to a far end comprises an echo and to be able to suppress or cancel such an echo, signals from both transmission directions have to be analysed. Therefore, two decoders 70, 71 are employed for linearising signals from the near-end (point where echo is reflected back) as “send in” signals and from the far-end as “receive in” signals respectively. It is easier and more accurate to carry out echo analysis in the linear domain. In the echo analysis block 72, the signal levels of the two linearised signals are estimated. If the level ratio of near and far-end signals is lower than a threshold value, the near-end signal is considered as an echo and comfort noise is inserted into the signal that is to be transmitted to the far-end as “send out” signal. If there is an acoustic echo, a special filtering can be used for far-end signal estimation to improve the double talk performance of the echo suppression, as described e.g. in document WO 9749196. In order to get the correct result from the signal comparison, the echo path delay has to be known. If the delay is variable, a delay estimation might be needed to define the correct delay value. A cross-correlation can be used for the delay estimation.
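
The level comparison and the cross-correlation based delay estimation can be sketched in Python as follows. The function names, the threshold of -6 dB and the exhaustive search over integer sample delays are assumptions for illustration; the far-end frame passed to `is_echo` is assumed to be already aligned with the echo path delay.

```python
import math


def level_db(frame):
    """Mean power of a linearised frame in dB."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


def is_echo(near_frame, far_frame_delayed, threshold_db=-6.0):
    """True if the near-to-far level ratio is below the threshold."""
    return level_db(near_frame) - level_db(far_frame_delayed) < threshold_db


def estimate_delay(near, far, max_delay):
    """Delay (in samples) maximising the cross-correlation of near and far."""
    best_delay, best_score = 0, float("-inf")
    for delay in range(max_delay + 1):
        n = min(len(near) - delay, len(far))
        if n <= 0:
            break
        score = sum(near[delay + i] * far[i] for i in range(n))
        if score > best_score:
            best_delay, best_score = delay, score
    return best_delay
```

In a frame based implementation the estimated delay would typically be rounded to whole frames or sub-frames before the comparison.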

In the noise estimation block 73, an accurate noise estimate of the linearised near-end signal received from the first decoder 70 is formed. Preferably, background noise is estimated in both the level and the spectral domain. The estimation method can be the same as the method described for noise suppression. Equally, other methods can be used, e.g. methods based on filter banks or Fourier transformation.

The comfort noise is then generated in the comfort noise generation block 74 by making use of the noise estimates received from the noise estimation block 73. To generate the comfort noise, a level scaled white noise is fed through a synthesis filter which has the same spectral envelope as estimated in the noise estimation block 73. The synthesis filter can therefore be an LP filter or a filter bank.
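
For the LP filter variant, the generation can be sketched in Python as below. The sketch assumes that the noise estimate is available as LP coefficients a_1..a_p of a polynomial A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and as a scalar noise gain; the function name and the use of Gaussian white noise are illustrative choices.

```python
import random


def comfort_noise(lp_coeffs, noise_gain, n_samples, seed=None):
    """Level scaled white noise fed through the all-pole filter 1/A(z)."""
    rng = random.Random(seed)
    order = len(lp_coeffs)
    history = [0.0] * order   # most recent output sample first
    out = []
    for _ in range(n_samples):
        excitation = noise_gain * rng.gauss(0.0, 1.0)
        # y[n] = e[n] - sum_k a_k * y[n-k]
        y = excitation - sum(a * h for a, h in zip(lp_coeffs, history))
        out.append(y)
        if order:
            history = [y] + history[:-1]
    return out
```

The filter is only stable if the LP coefficients describe a minimum-phase A(z), which is normally the case for coefficients obtained from a speech decoder.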

Finally, the generated comfort noise is encoded by the encoder 75 to form a frame or a sub-frame including an encoded comfort noise parameter.

If an echo was detected by the echo analysis block 72 for the current send in frame or sub-frame, the switch 76 is switched by the echo analysis block 72 to connect the encoder 75 with the output line and the current frame or sub-frame is replaced with the generated encoded comfort noise parameter. If no echo is detected, the switch 76 keeps connecting or is switched by the echo analysis block 72 to connect the first input line with the output line so that the original frame or sub-frame is forwarded to the output line without being replaced.

By using the described method, tandem speech coding can be avoided both in speech and comfort noise frames and high quality speech can be provided.

Alternatively, and in order to save processing and memory resources, the speech encoder can be omitted by generating comfort noise directly in the parameter domain. In the parameter domain comfort noise generation, a long-term LP spectrum envelope of background noise is averaged as described with reference to FIG. 6. Additionally, a long-term excitation gain parameter is averaged with the same updating principles as for the LP spectrum envelope updating, i.e. it is updated if the VAD flag is false. Typically only the fixed codebook gain value needs to be averaged as the adaptive codebook gain value is close to zero if there is a noise type of signal. As a comfort noise frame or sub-frame needs to be transmitted to the far-end, the original LPC and excitation gain parameters are replaced with the averaged LPC and gain parameters. Moreover, the original excitation pulses within the frame are replaced with random pulses which represent white noise in the parameter domain. If discontinuous transmission (DTX) is used in the send in direction, excitation pulses need not be transmitted. Instead, only averaged LPC and gain parameters are transmitted in the silence description frame (SID) which is standardised for most of the speech codecs. In discontinuous transmission, random excitation pulses are generated at the decoder end.

FIG. 8 shows a block diagram of an echo cancellation device that can be integrated in first processing means of a network element according to the invention for echo cancellation in the parameter domain.

A first input line is connected directly to a first decoder 80 and a second input line is connected via a FIFO (first in first out) frame memory 87 to a second decoder 81, both decoders 80, 81 being connected in turn to an adaptive filter 82. The adaptive filter 82 is connected to an NLP and comfort noise generation block 84 and the first decoder is connected to a second input of the same block 84 via a noise estimation block 83. The output of the NLP and comfort noise generation block 84 is connected via an encoder 85 to a switch 86. The switch 86 can either form a connection between the encoder 85 and an output line or between the first input line and the output line. An output of the first decoder 80, the second decoder 81 and the adaptive filter 82 are connected in addition to inputs of a control logic 88. The control logic 88 has controlling access to the adaptive filter 82, the NLP and comfort noise generation block 84 and the switch 86.

The proposed echo cancellation is quite similar to the above described echo suppression. The adaptive filter 82 and the control logic 88 are included to lower the echo signal before a residual echo suppression function is applied by a non-linear processor (NLP) 84. For the linear adaptive filtering, signals from both directions have to be linearised by the local decoders 80, 81. As there are two speech codings for the returning echo signal, cumulated non-linear distortions remarkably reduce the effectiveness of linear adaptive filtering. Therefore it might be desirable to include a non-linear echo modelling within echo cancellation, as described e.g. in document WO 9960720. Moreover, delays introduced into the echo path by speech codings, transmission or other signal processing can be compensated by the FIFO frame memory block 87. Thus the number of taps of the adaptive filter 82 can be reduced and less processing capacity is required.

The function of the noise estimation block 83 and the NLP and comfort noise generation block 84 can be similar to the above described noise suppression, although the control of the NLP 84 can be different as more parameters, e.g. echo path model, achieved echo attenuation, send in, receive in and residual echo signals, can be utilised in the NLP decision. This is handled within the control logic block 88. The output of the NLP and comfort noise generation block 84 is encoded by the encoder 85.

The switch 86 is provided for switching between speech frames received at the send in port and the encoded output of the NLP/comfort noise block 84, i.e. the output of the send out port is either a bypassed send in frame (or sub-frame) or an echo cancelled frame (or sub-frame). A criterion for the selection could be as follows.

If there is no speech activity or if the signal level of the far-end is low enough, send in frames are bypassed. Otherwise the output of the NLP/comfort noise block 84 is chosen as output after encoding by the encoder 85. Therefore, a TFO stream is left untouched if only the near-end talks or if there is silence in both directions. If only the far-end talks, encoded comfort noise is inserted. If there is a double talk condition, either comfort noise or the output of the adaptive filter 82 is chosen for the send out signal. This depends on the state of the NLP 84 and typically varies during the double talk. A benefit of this method is that there is a tandem free operation for the near-end signal most of the time. Tandem coded frames are sent in the far-end direction only at the time instants when there is double talk and the NLP block 84 is inactive. However, this is not subjectively more annoying than conventional echo cancellation, as the NLP switching already introduces some artefacts on near-end speech and because direct acoustic masking and side-tone of the far-end diminish the audibility of NLP artefacts during double talk.
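
The selection criterion can be summarised as a small decision function, sketched here in Python. The three boolean inputs and the returned labels are hypothetical; in the described device the decision would be taken by the control logic 88 on the basis of its level and state estimates.

```python
def select_send_out(near_active, far_active, nlp_active):
    """Choose the source of the send out frame or sub-frame.

    near_active / far_active: speech activity in the respective direction.
    nlp_active: whether the non-linear processor currently suppresses
    the residual echo.
    """
    if not far_active:
        # Only the near-end talks, or silence in both directions.
        return "bypass"
    if not near_active:
        # Only the far-end talks: the send in signal is echo.
        return "comfort_noise"
    # Double talk: depends on the state of the NLP.
    return "comfort_noise" if nlp_active else "adaptive_filter_output"
```

Only the last case, double talk with an inactive NLP, leads to tandem coded near-end speech being sent towards the far-end.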

Alternatively, in order to save processing and memory resources, the encoder could be omitted by generating comfort noise directly in the parameter domain as described with reference to FIG. 7.

1. An apparatus, comprising: a payload extraction block configured to extract coded digitised analogue signals from a digital network, wherein the extracted coded digitised analogue signals comprise at least in part parameterised coded digitised analogue signals; a first processor configured to process the extracted parameterised coded digitised analogue signals in a parameter domain with functions suitable to enhance a quality of the extracted parameterised coded digitised analogue signals; a second processor configured to process at least part of the extracted coded digitised analogue signals in a linear domain with functions suitable to enhance a quality of the extracted coded digitised analogue signals; a payload insertion block configured to insert the processed coded digitised analogue signals into the digital network; and an analyzer configured to determine a quality improvement of the digitised analogue signals resulting from a processing of the extracted parameterised coded digitised analogue signals in the parameter domain and from a processing of the extracted coded digitised analogue signals in the linear domain, and to determine which processing is capable of providing a better quality improvement, wherein the payload insertion block is further configured to insert at least the coded digitised analogue signals processed by the first processor or the second processor which lead to the quality improvement back into the digital network.
2. The apparatus according to claim 1, wherein the functions for processing the parameterised coded digitised analogue signals by the first processor comprise at least one of echo cancellation, noise reduction, or level control.
3. The apparatus according to claim 1, wherein the functions for processing coded digitised analogue signals by the second processor comprise at least one of echo cancellation, noise reduction, level control, transcoding or speech mixing.
4. The apparatus according to claim 1, wherein the analyzer is further configured to analyze the extracted digitised analogue signals before and after processing by the first and the second processors to determine the better quality improvement.
5. The apparatus according to claim 1, wherein the analyzer comprises a neural network.
6. The apparatus according to claim 1, further comprising: a bad frame handler configured to detect in the extracted coded digitised analogue signals at least one of missing frames or frames in disorder, and further configured to regenerate at least one of missing frames or reordering frames in disorder in the extracted coded digitised analogue signals.
7. The apparatus according to claim 1, wherein the analyzer is further configured to determine whether any processing is to be applied to the extracted parameterised coded digitised analogue signals and further configured to select the functions to be applied to the extracted coded digitised analogue signals by at least one of the first processor or the second processor based on at least one of the extracted coded digitised analogue signals or an external control signal.
8. The apparatus according to claim 1, further comprising: a control block configured to receive an external control signal and further configured to control the selection of the processing applied to the extracted coded digitised analogue signals directly or by the analyzer.
9. The apparatus according to claim 1, wherein the payload extraction block is further configured to extract parameterised coded digitised analogue signals from an internet protocol stack of a packet-based network, and further configured to insert the extracted parameterised coded digitised analogue signals into the internet protocol stack of said packet-based network.
10. The apparatus according to claim 9, further comprising: a decoder configured to decode said extracted parameterised coded digitised analogue signals and further configured to forward the decoded signals to the second processor; an encoder configured to encode the processed coded digitised analogue signals output by the second processor; a first selector configured to receive the extracted coded digitised analogue signals from the payload extraction block and further configured to forward the extracted coded digitised analogue signals either to the first processor or to the second processor via the decoder; and a second selector configured to select as inputs the outputs of the first processor and the outputs of the second processor and further configured to select which output is to be forwarded to the payload insertion block, wherein the analyzer is further configured to determine whether the extracted coded digitised analogue signals are to be processed by the first processor or the second processor and further configured to control the first selector and the second selector accordingly.
 11. The apparatus according to claim 1, wherein the payload extraction block is further configured to extract a tandem free operation stream and a pulse code modulation stream from a time division multiplex based network, further configured to separate said tandem free operation stream from said pulse code modulation stream, further configured to provide the tandem free operation stream as parameterised coded digitised analogue signals, further configured to combine a tandem free operation stream with a pulse code modulation stream, and further configured to insert the combined stream, or only a pulse code modulation stream if no tandem free operation stream is provided to the payload insertion block, to the time division multiplex based network.
 12. The apparatus according to claim 11, further comprising: a decoder configured to decode said tandem free operation stream and further configured to forward the decoded parameterised digitized analogue signals to the second processor; an encoder configured to encode the processed digitized analogue signals output by the second processor; a receiver configured to receive the tandem free operation stream from the payload extraction block and further configured to forward the tandem free operation stream to the first processor and to the second processor via the decoder; and a selector configured to receive as inputs the outputs of the first processor and the outputs of the encoder and further configured to forward an output with the quality improvement of the coded digitised analogue signals to the payload insertion block, wherein the analyzer is further configured to determine whether a processing in the first processor or in the second processor results in a better quality improvement of the digitised analogue signals and further configured to control the selector accordingly, and wherein the selector is further configured to form a pulse code modulation stream out of an output of the second processor and further configured to forward said pulse code modulation stream to the payload insertion block.
 13. The apparatus according to claim 12, wherein the payload extraction block is further configured to provide the pulse code modulation stream as non-parameterised coded digitised analogue signals, and wherein the selector is further configured to forward the extracted pulse code modulation stream to the second processor and further configured to forward the output of the second processor to the payload insertion block, if no tandem free operation stream is available for processing.
 14. The apparatus according to claim 11, wherein the payload extraction block is further configured to provide the pulse code modulation stream as non-parameterised coded digitised analogue signals, wherein the first processor is configured to process the tandem free operation stream in the parameter domain, wherein the second processor is configured to process the pulse code modulation stream in the linear domain, and wherein at least the processed pulse code modulation stream is forwarded to the payload insertion block.
 15. The apparatus according to claim 1, wherein the first processor is further configured to compare gain parameters of the extracted parameterised coded digitised analogue signals with a desired gain, further configured to form corresponding new gain parameters, and further configured to replace original gain parameters with the new gain parameters in the extracted parameterised coded digitised analogue signals.
 16. The apparatus according to claim 15, wherein the first processor comprises: a decoder configured to linearise the extracted parameterised coded digitised analogue signals and further configured to provide decoded gain parameters of the digitised analogue signals; an estimator configured to estimate a level of the linearised coded digitised analogue signal and further configured to determine desired gain values based on the estimated level of the digitised analogue signal and a desired target level of the digitised analogue signal; and a processor configured to determine from the decoded gain parameters of the coded digitised analogue signal and the desired gain values new gain parameters suitable for achieving the desired gain by linear-to-parameter domain mapping, wherein the processor is further configured to re-quantise the new gain parameters and further configured to replace the original gain parameters with the new gain parameters in the parameterised coded digitised analogue signals.
 17. The apparatus according to claim 16, wherein the estimator comprises a voice activity detector configured to ensure that only speech signals are estimated in the signal level estimate.
 18. The apparatus according to claim 1, wherein the first processor is further configured to attenuate noise portions and low level signal portions of the extracted parameterised coded digitised analogue signals in the time domain and further configured to correspondingly replace the gain parameters in the extracted parameterised coded digitised analogue signals.
 19. The apparatus according to claim 1, wherein the first processor is further configured to attenuate frequency portions of noise in the extracted parameterised coded digitised analogue signals which have approximately the same energy as a noise estimate and further configured to correspondingly replace linear prediction parameters in the extracted parameterised coded digitised analogue signals.
 20. The apparatus according to claim 19, wherein the first processor comprises: a decoder configured to decode linear prediction coefficients from the extracted coded digitised analogue signals; an estimator configured to estimate a long-term power spectrum of the noise of the digitised analogue signals and further configured to estimate a short-term power spectrum of the noise of the digitised analogue signals; and a processor configured to determine a desired spectrum depending on the difference between a long-term spectrum and the short-term spectrum, further configured to determine new linear prediction coefficients according to the desired spectrum, further configured to quantise the new linear prediction coefficient parameters or to convert them to line spectral pairs parameters, and further configured to substitute them for the old parameters in the extracted parameterised coded digitised analogue signal.
 21. The apparatus according to claim 1, wherein the payload extraction block is further configured to extract further coded digitised analogue signals from the digital network transmitted in an opposite direction to the extracted coded digitised analogue signals, wherein the further coded digitised analogue signals comprise at least in part parameterised coded digitised analogue signals; and wherein the first processor is further configured to compare first extracted parameterised coded digitised analogue signals and the further extracted parameterised coded digitised analogue signals to detect echoes in the first extracted parameterised coded digitised analogue signals and further configured to replace portions of the first extracted parameterised coded digitised analogue signals with comfort noise portions, if an echo was determined in the portion of the first extracted parameterised coded digitised analogue signals.
 22. The apparatus according to claim 21, wherein the first processor comprises: a first decoder configured to linearise the extracted coded digitised analogue signals from a first direction; a second decoder configured to linearise the further extracted coded digitised analogue signals from an opposite direction; an echo analyzer configured to detect an echo in a portion of the first extracted parameterised coded digitised analogue signals from a first direction; and a generator configured to generate comfort noise and further configured to replace an original portion of the first extracted parameterised coded digitised analogue signals from the first direction with corresponding comfort noise parameters in case an echo was detected.
 23. The apparatus according to claim 1, wherein the payload extraction block is further configured to extract further coded digitised analogue signals from the digital network transmitted in the opposite direction to the extracted coded digitised analogue signals, wherein the further coded digitised analogue signals comprise at least in part parameterised coded digitised analogue signals; and wherein the first processor is further configured to attenuate an echo signal in the first extracted parameterised coded digitised analogue signals making use of the further parameterised coded digitised analogue signals and further configured to suppress a residual echo signal.
 24. The apparatus according to claim 23, wherein the first processor comprises: a first decoder configured to linearise the extracted coded digitised analogue signals from a first direction; a second decoder configured to linearise the further extracted coded digitised analogue signals from an opposite direction; an adaptive filter and a control logic receiver configured to linearise signals from the first and the second decoder to attenuate echo signals in the linearised coded digitised analogue signals received from the first decoder; a non linear processor configured to process residual echo suppression based on residual echo signals received from the adaptive filter and further based on a noise estimation of the linearised coded digitised analogue signals from the first direction; and a generator configured to generate comfort noise based on the residual echo suppression and further configured to replace an original portion of a first extracted parameterised coded digitised analogue signal with a corresponding comfort noise parameter in case an echo was detected.
 25. The apparatus according to claim 21, wherein the first processor is further configured to by-pass the first extracted parameterised coded digitised analogue signals without processing, if there is no signal activity in the opposite direction or if the signal level of the extracted parameterised coded digitised analogue signals is below a threshold level in the opposite direction.
 26. The apparatus according to claim 1, wherein the extracted coded digitised analogue signals comprise at least one of coded speech or coded video.
 27. The apparatus according to claim 1, wherein the apparatus comprises a network element configured to enhance the quality of the extracted coded digitised analogue signals transmitted at least in parameterised coded form via the digital network to which the network element has access.
 28. A method, comprising: extracting coded digitised analogue signals from a digital network, wherein the coded digitised analogue signals comprise parameterised coded digitised analogue signals; determining a quality improvement of the coded digitised analogue signals to be expected by a processing of the extracted coded digitised analogue signals in a parameter domain and by a processing of the extracted coded digitised analogue signals in a linear domain; processing the extracted parameterised coded digitised analogue signals in the parameter domain if a greater quality improvement is expected by the processing in the parameter domain with functions suitable for enhancing the quality of digitised analogue signals; processing the extracted parameterised coded digitised analogue signals in the linear domain if a greater quality improvement is expected by processing in the linear domain with functions suitable for enhancing the quality of digitised analogue signals; and inserting the processed parameterised coded digitised analogue signals into the digital network that were processed in the parameter domain or the linear domain.
 29. The method according to claim 28, further comprising: decoding the extracted parameterised coded digitised analogue signals for processing in the linear domain; and encoding the extracted parameterised coded digitised analogue signals after processing in the linear domain to form parameterised coded digitised analogue signals again.
 30. The method according to claim 29, further comprising: transforming the decoded extracted parameterised coded digitised analogue signals to form non-parameterised coded digitised analogue signals; and inserting the non-parameterised coded digitised analogue signals into the digital network.
 31. The method according to claim 28, further comprising: forming non-parameterised coded digitised analogue signals corresponding to the extracted parameterised coded digitised analogue signals; and processing the non-parameterised coded digitised analogue signals in the linear domain, wherein processing the extracted parameterised coded digitised analogue signals comprises processing in the parameter domain if a greater quality improvement is expected by processing in the parameter domain, wherein inserting comprises inserting the processed extracted non-parameterised coded digitised analogue signals into the digital network again, and wherein inserting further comprises inserting the processed extracted parameterised coded digitised analogue signals into the digital network again if a greater quality improvement is expected by processing in the parameter domain.
 32. The method according to claim 28, wherein the quality improvement of a processing in the linear and in the parameter domain is determined by analysing the extracted parameterised coded digitised analogue signal before and after processing in the linear and in the parameter domain.
 33. The method according to claim 28, wherein the quality improvement of the processing in the linear domain and the processing in the parameter domain is determined using a neural network.
 34. The method according to claim 28, further comprising: selecting processing functions that are suitable for an enhancement of the quality of the extracted parameterised coded digitised analogue signals; and performing only those processing functions.
 35. The method according to claim 28, wherein the processing in the parameter domain comprises forming corresponding gain parameters for a gain control by comparing gain parameters of the extracted parameterised coded digitised analogue signals with a desired gain, and replacing the gain parameters with the corresponding gain parameters in the extracted parameterised coded digitised analogue signals.
 36. The method according to claim 35, further comprising: linearising extracted parameterised coded digitised analogue signals; providing decoded gain parameters of the digitised analogue signals; estimating a signal level of the linearised coded digitised analogue signals; determining desired gain values based on the estimated signal level and a desired target signal level; determining out of the decoded gain parameters of the coded digitised analogue signals and the desired gain values, new gain parameters suitable for achieving a desired gain by linear-to-parameter domain mapping; and re-quantising the new gain parameters and replacing original gain parameters with the new gain parameters in the coded digitised analogue signals.
 37. The method according to claim 28, wherein the processing in the parameter domain comprises attenuating noise portions and low level signal portions of the extracted parameterised coded digitised analogue signals for noise suppression in the time domain, and for correspondingly replacing gain parameters in the extracted parameterised coded digitised analogue signals.
 38. The method according to claim 28, wherein the processing in the parameter domain comprises attenuating frequency portions of noise for noise suppression in the extracted parameterised coded digitised analogue signals which have approximately the same energy as a noise estimate and for correspondingly replacing linear prediction parameters in the extracted parameterised coded digitised analogue signals.
 39. The method according to claim 38, further comprising: decoding linear prediction coefficients from extracted coded digitised analogue signals; estimating a long-term power spectrum of the noise of the digitised analogue signals; estimating a short-term power spectrum of the noise of the digitised analogue signals; determining a desired spectrum based upon a difference between the long-term spectrum and the short-term spectrum; determining new linear prediction coefficients according to the desired spectrum; and quantising the new linear prediction coefficient parameters or converting them to line spectral pairs parameters and substituting them for the old parameters in the parameterised coded digitised analogue signal.
 40. The method according to claim 28, wherein the processing in the parameter domain comprises extracting further parameterised coded digitised analogue signals transmitted in the opposite direction for echo suppression, comparing the first extracted and the further extracted parameterised coded digitised analogue signals to detect echoes in the first extracted parameterised coded digitised analogue signals, and replacing portions of the first extracted parameterised coded digitised analogue signal with generated portions of comfort noise parameters, if an echo was determined in a portion of a first extracted parameterised coded digitised analogue signal.
 41. The method according to claim 40, further comprising: linearising the extracted coded digitised analogue signals transmitted from a first direction and an opposite direction before comparing them.
 42. The method according to claim 28, further comprising: extracting further coded digitised analogue signals from the digital network transmitted in the opposite direction to the extracted coded digitised analogue signals, wherein the further coded digitised analogue signals comprise at least in part parameterised coded digitised analogue signals, attenuating an echo signal in first extracted parameterised coded digitised analogue signals making use of further parameterised coded digitised analogue signals, and suppressing the residual echo signal.
 43. The method according to claim 42, further comprising: linearising the extracted coded digitised analogue signals transmitted in a first direction and the opposite direction before attenuating the echo signal, generating comfort noise based on the result of the suppression and an estimated noise in a first extracted digitised analogue signal and replacing a portion of an original first extracted digitised analogue signal in which an echo was detected with a portion comprising a corresponding comfort noise parameter.
 44. The method according to claim 40, wherein the processing in the parameter domain comprises by-passing the first extracted parameterised coded digitised analogue signals without echo detection, if there is no signal activity in the opposite direction or if the signal level of the extracted parameterised coded digitised analogue signals is below a threshold level in the opposite direction.
 45. A computer readable storage medium encoded with instructions that, when executed by a computer, perform a process, the process comprising: extracting coded digitised analogue signals from a digital network, wherein the coded digitised analogue signals comprise parameterised coded digitised analogue signals; determining a quality improvement of the coded digitised analogue signals to be expected by a processing of the extracted coded digitised analogue signals in a parameter domain and by a processing of the extracted coded digitised analogue signals in a linear domain; processing the extracted parameterised coded digitised analogue signals in the parameter domain if a greater quality improvement is expected by the processing in the parameter domain with functions suitable for enhancing the quality of digitised analogue signals; processing the extracted parameterised coded digitised analogue signals in the linear domain if a greater quality improvement is expected by processing in the linear domain with functions suitable for enhancing the quality of digitised analogue signals; and inserting the processed parameterised coded digitised analogue signals into the digital network that were processed in the parameter domain or the linear domain.
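
By way of illustration only, the parameter-domain gain control recited in claims 15 to 17 and 35 to 36 might be sketched as follows. The sketch is not taken from the specification: the uniform 2 dB gain codebook `GAIN_TABLE_DB`, the energy-threshold voice activity check, the averaging of frame levels in dB and all function names are assumptions chosen to keep the example small. A real codec defines its own gain quantiser, and the linear-to-parameter domain mapping would have to follow that codec.

```python
import math

# Hypothetical gain codebook, uniform in dB (assumption; real codecs differ).
GAIN_TABLE_DB = [-30.0 + 2.0 * i for i in range(32)]


def decode_gain(index):
    """Decoded gain parameter in dB for a quantiser index."""
    return GAIN_TABLE_DB[index]


def quantise_gain(gain_db):
    """Re-quantise a gain in dB to the nearest codebook index."""
    return min(range(len(GAIN_TABLE_DB)),
               key=lambda i: abs(GAIN_TABLE_DB[i] - gain_db))


def estimate_level_db(frames, vad_threshold_db=-50.0):
    """Mean level in dB over frames that a crude energy-based voice
    activity check treats as speech; None if no frame qualifies."""
    levels = []
    for frame in frames:
        power = sum(s * s for s in frame) / len(frame)
        level = 10.0 * math.log10(power) if power > 0 else -math.inf
        if level > vad_threshold_db:
            levels.append(level)
    return sum(levels) / len(levels) if levels else None


def new_gain_indices(gain_indices, linear_frames, target_db):
    """Form new gain parameters that move the estimated level towards
    the target level and return them as re-quantised indices."""
    level = estimate_level_db(linear_frames)
    if level is None:
        # No speech detected: leave the original gain parameters in place.
        return list(gain_indices)
    desired_gain_db = target_db - level
    return [quantise_gain(decode_gain(i) + desired_gain_db)
            for i in gain_indices]


if __name__ == "__main__":
    speech = [[0.1] * 4, [0.1] * 4]   # linearised frames at about -20 dB
    silence = [[0.0] * 4]             # excluded by the activity check
    print(new_gain_indices([10, 12], speech + silence, target_db=-14.0))
```

In this sketch the linearised signal is used only to estimate the level; the coded stream is altered solely by exchanging gain indices, so no full decode and re-encode pass is applied to the transmitted parameters.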